Abstract 

Background: Single nucleotide polymorphisms (SNPs) that reside in microRNA target sites may play an important role in 
breast cancer development and progression. To reveal the association between microRNA target site SNPs and breast 
cancer risk, we performed a large case-control study in China. 

Methods: We performed a two-stage case-control study including 2744 breast cancer cases and 3125 controls. In Stage I, we 
genotyped 192 SNPs within microRNA binding sites identified from the "Patrocles" database using custom Illumina 
GoldenGate VeraCode assays on the Illumina BeadXpress platform. In Stage II, genotyping was performed on SNPs 
potentially associated with breast cancer risk using the TaqMan platform in an independent replication set. 

Results: In stage I, 15 SNPs were identified to be significantly associated with breast cancer risk (P<0.05). In stage II, one SNP, 
rs8752, was replicated at P<0.05. This SNP is located in the 3' untranslated region (UTR) of the 15-hydroxyprostaglandin 
dehydrogenase (HPGD) gene at 4q34-35, in a miR-485-5p binding site. Compared with the GG genotype, the combined GA+AA 
genotypes had a significantly higher risk of breast cancer (OR = 1.18; 95% CI: 1.06-1.31, P = 0.002). Specifically, this SNP 
was associated with estrogen receptor (ER) positive breast cancer (P = 0.0007), but not with ER negative breast cancer 
(P = 0.23), although the P for heterogeneity was not significant. 

Conclusion: Through a systematic case-control study of microRNA binding site SNPs, we identified a new breast cancer risk 
variant, rs8752 in HPGD, in Chinese women. Further studies are warranted to investigate the underlying mechanism for this 
association. 

Introduction 

Breast cancer is the most common female malignancy 
worldwide, and its incidence has been increasing during the past 
several decades in both developing and developed countries [1]. It 
is widely accepted that environmental and genetic factors 
contribute to the development of breast cancer. Although environmental 
factors play an important role in breast cancer, an 
individual's risk of breast cancer is determined by genetic 
susceptibility. Numerous investigations have suggested that microRNAs 
are essential for various biological processes and diseases, 
including tumorigenesis [2,3,4,5,6,7]. MicroRNAs inhibit gene 
translation by binding to the 3' UTRs of target mRNAs. In recent 
years, many studies have revealed that SNPs or mutations within 
microRNA binding sites may affect cancer susceptibility by 
disrupting the miRNA-mRNA interaction and mRNA expression 
[8,9,10,11]. Several bioinformatic methods have been introduced 
to predict candidate SNPs located in microRNA target sites, 
including "Patrocles" and "PolymiRTS" [12,13,14,15,16]. Some 
case-control studies have been performed to investigate the 
association between SNPs in microRNA binding sites and breast 
cancer risk [17,18,19,20,21,22]. Wang et al. found that a miRNA 
binding site SNP in the 3'-UTR region of the IL23R gene may be 
associated with the risk of breast cancer and contribute to the early 
development of breast cancer in Chinese women [23]. Teo et al. 
first reported the association between DNA repair gene PARP1 
miRNA-binding site SNP rs8679 and breast cancer risk [24]. 
Zheng et al. reported that the presence of SNPs at the miR-124 
binding site may be a marker for breast cancer risk and prognosis 
[25]. Kontorovich et al. found that heterozygous carriers of 
SNP rs11169571 had an approximately 2-fold increased risk for 
developing breast cancer, whereas heterozygotes of the rs895819 
SNP had an approximately 50% reduced risk for developing 
breast cancer [26]. Saetrom et al. suggested that allele-specific 
regulation of BMPR1B by miR-125b explains the observed disease 
risk [27]. Brendle et al. showed that the A allele of the ITGB4 SNP 
rs743554 was associated with negative hormone receptor 
status and poor breast cancer-specific survival, especially in women 
with more aggressive tumors [28]. However, most of these were 
candidate gene studies that included only a few SNPs and therefore could not 
capture the overall contribution of such SNPs to breast cancer 
etiology. 

In this study, high-throughput SNP genotyping was used in a 
large case-control study including genome-wide microRNA 
binding site SNPs. The results may provide new insights into the 
cause of breast cancer, and new molecular markers for breast 
cancer diagnosis. 

Materials and Methods 

Ethics statement 

The Ethics Committee of Tianjin Medical University Cancer 
Institute and Hospital approved the study protocol, and all 
patients and controls gave written informed consent before 
participating in the study. 

Study subjects 

A total of 5869 individuals (2744 breast cancer cases and 3125 
controls) were involved in this study. The cases were newly 
diagnosed, histologically confirmed breast cancer patients recruited 
from Tianjin Medical University Cancer Hospital between 
January 1, 2006 and December 31, 2008. Over the 
same period, controls were enrolled from the nearby community; 
they were genetically unrelated to the patients and were frequency 
matched to the patients by age (±3 years). Our study comprised 
two stages. In stage I, we randomly selected 1349 patients and 
1572 cancer-free female controls for SNP screening. In stage II, to 
validate the findings from stage I, a validation set of 1395 cases 
and 1553 controls was genotyped. Participants were interviewed 
by trained investigators using a systematic questionnaire about their 
demographic characteristics, personal habits, family history, 
occupational exposure history, eating habits, physical exercise, 
and reproductive factors. For the cases, clinical information on 
tumor features and disease severity, including morphology, tumor 
size, lymph node metastasis, organ metastasis, tumor stage, and 
status of estrogen receptor (ER) and progesterone receptor (PR), 
was also collected. Each participant provided 10 ml of venous 
blood. The study was approved by the Institutional Review Board 
of Tianjin Medical University; informed consent was obtained 
from all patients. 

SNP selection 

The "Patrocles" online database (http://www.patrocles.org/) 
was used to select genome-wide microRNA target SNPs. Of 
the 5035 microRNA binding site SNPs that the 
database provided, 1742 had been confirmed. SNPs that 
satisfied the following criteria were considered for inclusion: (1) 
the SNP was located in a microRNA seed-region binding site, with 
the seed region defined according to the "7-mers" criteria [12]; and 
(2) the SNP had reported population frequency data in Chinese 
(http://www.ncbi.nlm.nih.gov/snp/) and a minor 
genotype frequency ≥0.05. In this way, a total of 
192 microRNA target SNPs were included in our study. 
Detailed information for these SNPs is given in Table S1 in File 
S1. 
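
To make the first selection criterion concrete, the following is a minimal sketch of how a perfect 7-mer seed match can be detected and how a single-base change inside the site removes it. The sequences and function names are hypothetical toy examples, not miR-485-5p, not the HPGD 3' UTR, and not the Patrocles pipeline; the sketch also assumes strict Watson-Crick pairing to miRNA nucleotides 2-8 and ignores G:U wobble pairs.

```python
# Toy illustration of a 7-mer seed site and its disruption by a SNP.
# All sequences below are hypothetical and chosen only for illustration.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the mRNA 7-mer (5'->3') that pairs with miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # positions 2-8, counted from the miRNA 5' end
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def has_seed_site(utr: str, mirna: str) -> bool:
    """True if the 3' UTR contains a perfect Watson-Crick match to the seed."""
    return seed_site(mirna) in utr


mirna = "ACGGUCAUGGCAUUCGAUCCGA"  # hypothetical miRNA, 5'->3'
utr_ref = "GCUUAUGACCGUUAC"       # hypothetical UTR carrying the reference allele
utr_alt = "GCUUAUAACCGUUAC"       # same UTR with a G>A change inside the site

print(seed_site(mirna))                 # AUGACCG
print(has_seed_site(utr_ref, mirna))    # True
print(has_seed_site(utr_alt, mirna))    # False
```

In this toy case the reference allele keeps the seed match and the alternative allele abolishes it, which is the kind of change the selection step is meant to capture.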



Genomic DNA samples 

Whole blood samples from each participant were collected 
and stored in Vacutainer tubes (BD, Franklin Lakes, NJ) containing 
EDTA as an anticoagulant. Total genomic DNA was extracted from 
whole blood using the QIAGEN DNA Extraction Kit (QIAGEN 
Inc.). The extracted DNA was stored at -20°C in TE buffer. 

SNP genotyping 

In stage I, SNP genotyping was conducted using Illumina 
GoldenGate SNP genotyping arrays according to the manufacturer's 
instructions. Only plates with a consistently high call rate in 
the initial calling were used. If the call rate was <80%, the 
experiment was repeated. In stage II, genotyping was performed using the 
TaqMan platform in 384-well plates and read with the Sequence 
Detection Software on an ABI Prism 7900 instrument according 
to the manufacturer's instructions (Applied Biosystems, Foster 
City, CA). Primers and probes were supplied by Applied 
Biosystems. The PCR conditions were as follows: 50°C for 2 
minutes, 95°C for 10 minutes, and 60°C for 1 minute for 40 
cycles. After 2 rounds of genotyping, the genotyping success rate 
was 99%. In addition, 5% of the samples were selected for 
replicate genotyping, and the results were 100% concordant. 

Statistical analysis 

A χ² test was used to evaluate the differences in the distributions 
of major demographic variables and environmental risk factors, as 
well as the genotypes of selected SNPs, between the breast cancer 
cases and controls. Hardy-Weinberg equilibrium was 
assessed by a χ² goodness-of-fit test in controls. Unconditional 
logistic regression was used to examine the association between the 
SNPs and breast cancer risk by estimating the odds ratios (ORs) 
and 95% confidence intervals (CIs), with and without adjustment 
for age, smoking status, menopause status, oral contraception use, 
history of benign breast diseases, and family history of cancer. All 
statistical tests were two-sided, and a P value <0.05 was 
considered significant; correction for multiple comparisons was not 
performed. We used SAS software version 9.0 (SAS Institute) 
for all statistical analyses. 
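
As an illustration of the Hardy-Weinberg check described above, the sketch below applies a χ² goodness-of-fit test with 1 degree of freedom to the combined control genotype counts for rs8752 reported in Table 3 (GG 1431, GA 1361, AA 322). It is a plain Python reimplementation written for this example, not the SAS code used in the study; the function name is an assumption.

```python
import math


def hwe_chi_square(n_gg: int, n_ga: int, n_aa: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions."""
    n = n_gg + n_ga + n_aa
    p = (2 * n_gg + n_ga) / (2 * n)  # frequency of the G allele
    q = 1.0 - p                      # frequency of the A allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_gg, n_ga, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Combined (stage I + II) control genotype counts for rs8752 from Table 3.
chi2, p_value = hwe_chi_square(1431, 1361, 322)
print(f"chi2 = {chi2:.4f}, P = {p_value:.3f}")
```

For these counts the statistic falls far below the 3.84 critical value for 1 degree of freedom, so the control genotypes show no evidence of departure from Hardy-Weinberg equilibrium.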

Results 

The demographic characteristics of the 2744 breast cancer cases 
and 3125 cancer-free controls (stages I and II combined) are 
presented in Table 1. Age was matched between cases and 
controls (P = 0.447). The differences in smoking status (P<0.001), 
oral contraceptive usage (P<0.001), menopause (P<0.001), 
history of benign breast diseases (P<0.001), and family history of 
cancer (P<0.001) were statistically significant between cases and 
controls. These characteristics were comparable between samples 
from stage I and samples from stage II (Table S2 in File S1). 

In stage I, among the 192 candidate SNPs, 15 SNPs showed a 
significant association with breast cancer risk at P<0.05. The 
associated SNPs were rs17264436 in PBRM1, rs1044145 in 
YOD1, rs1325774 in CLDN10, rs2654981 in IGF1R, 
rs7359387 in NFAT5, rs1047499 in SPTBN1, 
rs1180342 in BMP8A, rs1056796 in MLANA, rs8410 in 
PREPL, rs1130741 in MPI, rs8752 in HPGD, 
rs2466551 in NFYB, rs1059111 in NEFL, rs2530310 in 
CNTNAP2, and rs698761 in PREPL (Table 2). 

In stage II, among the 15 SNPs identified in stage I, the SNP 
rs8752 in the HPGD gene (the duplex structure between miR-485-5p 
and HPGD is shown in Figure S1 in File S1) was significantly 
associated with breast cancer risk in an independent replication set 
(P = 0.018). When data from stage I and stage II were combined, 

Table 1. Characteristics of breast cancer cases and cancer-free controls (Stage I and II).

Variables                              Cases (n = 2744)   Controls (n = 3125)   P value a   OR (95% CI)
                                       No. (%)            No. (%)
Age
  ≤55                                  1839 (67.0)        2065 (66.1)           0.447
  >55                                  905 (33.0)         1060 (33.9)
Menopause b
  No                                   1299 (47.3)        1133 (36.7)           <0.001      1.00
  Yes                                  1448 (52.7)        1950 (63.3)                       0.65 (0.58, 0.72)
Oral contraception use b
  Never                                2115 (82.4)        2583 (86.5)           <0.001      1.00
  Ever                                 452 (17.6)         403 (13.5)                        1.37 (1.18, 1.59)
Smoking status b
  Never                                2311 (88.0)        2907 (95.0)           <0.001      1.00
  Ever                                 316 (12.0)         153 (5.0)                         2.60 (2.13, 3.18)
History of benign breast disease b
  Never                                1995 (73.6)        2634 (86.3)           <0.001      1.00
  Ever                                 716 (26.4)         417 (13.7)                        2.27 (1.98, 2.59)
Family history of cancer b,c
  No                                   1908 (69.4)        2475 (80.4)           <0.001      1.00
  Yes                                  843 (30.6)         603 (19.6)                        1.81 (1.61, 2.05)

a Two-sided χ² test; P<0.05 was considered statistically significant.
b Due to missing values, the number of patients was <2744 and that of controls was <3125 in the test set.
c First- and second-degree relatives with history of cancer.
doi:10.1371/journal.pone.0102093.t001



compared with the rs8752 GG genotype, the GA and AA genotypes were 
associated with a higher risk of breast cancer (OR = 1.18; 95% CI: 
1.06-1.31, P = 0.002). Furthermore, we assessed the association 
between rs8752 and breast cancer risk according to ER and PR 
status. The association was significant for ER positive breast 
cancer (OR = 1.24; 95% CI: 1.10-1.40, P = 0.0007), but not 
significant for ER negative breast cancer (OR = 1.09; 95% CI: 
0.95-1.26, P = 0.23), although the P for heterogeneity was not significant. 
The association between rs8752 and breast cancer risk was similar 
for PR positive and PR negative breast cancer (Table 3). 
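
The unadjusted combined estimate can be checked directly from the genotype counts in Table 3 (GA+AA versus GG; 1590 and 1145 cases, 1683 and 1431 controls). The sketch below computes the odds ratio, a Wald 95% confidence interval, and a Pearson χ² P value in plain Python. It is an illustrative recomputation under those standard formulas, not the SAS logistic regression used in the study, and the function names are assumptions.

```python
import math


def odds_ratio_wald(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.959964):
    """Unadjusted odds ratio with a Wald 95% confidence interval."""
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    se_log_or = math.sqrt(
        1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls
    )
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )


def pearson_chi2_p(a, b, c, d):
    """Two-sided P value from Pearson's chi-square test on a 2x2 table (1 df)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2.0))


# Combined stage I + II counts for rs8752 from Table 3.
carriers_cases, gg_cases = 1590, 1145        # GA+AA and GG among cases
carriers_controls, gg_controls = 1683, 1431  # GA+AA and GG among controls

odds_ratio, ci_low, ci_high = odds_ratio_wald(
    carriers_cases, gg_cases, carriers_controls, gg_controls
)
p_value = pearson_chi2_p(carriers_cases, gg_cases, carriers_controls, gg_controls)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), P = {p_value:.3f}")
# prints: OR = 1.18 (95% CI 1.06-1.31), P = 0.002
```

The result agrees with the unadjusted combined values reported in the text and in Table 3.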

Discussion 

In this study, we performed a two-stage case-control study 
including 2744 cases and 3125 controls. Among the 192 SNPs 
genotyped, the SNP rs8752 (A allele) in the HPGD gene was identified 
to be associated with an increased risk of breast cancer in both 
stages. This study provides evidence for a novel 
susceptibility variant for breast cancer on chromosome 4q34-35. 

Our study covered three SNPs (IGF1R rs2654981, NFAT5 
rs7359387, NEFL rs1059111) that were previously studied in the 
context of breast cancer risk. IGF1 receptor (IGF1R) 
overexpression has been associated with a number of hematological 
neoplasias and solid tumors, including breast cancer [29]. Kang et al. 
found that seven of the 51 IGF1R SNPs were in linkage 
disequilibrium (LD) and in one haplotype block, and were 
likely to be associated with breast cancer risk [30]. Jauliac et al. 
found that NFAT5 was expressed in invasive human 
ductal breast carcinomas and participated in promoting carcinoma 
invasion, using cell lines derived from human breast and colon 
carcinomas [31]. NEFL has been shown to act as a tumor 
suppressor in breast carcinogenesis [32,33]. However, we 
found a significant association between these three SNPs and breast 
cancer risk only in stage I. In stage II (the validation set), we did not 
find a significant association. 

MicroRNA-related SNPs can generally be categorized into 
three groups: SNPs in microRNA sequences, SNPs in microRNA 
biogenesis pathway genes, and SNPs in microRNA target sites 
[34,35,36]. To date, SNPs in microRNA sequences and 
microRNA biogenesis pathway genes have been systematically 
studied, and important findings have been reported from these studies 
[37,38]. However, for the association between microRNA binding 
site SNPs and breast cancer risk, most previous studies were based 
on a candidate gene strategy. Results from these studies were not 
sufficient to represent the role of such SNPs in the etiology of 
cancer. We therefore conducted a systematic case-control study 
including genome-wide microRNA binding site SNPs. 

The HPGD gene at chromosome 4q34-35 encodes a member of the short-chain 
non-metalloenzyme alcohol dehydrogenase protein family. The 
encoded enzyme is responsible for the metabolism of prostaglandins, 
which function in a variety of physiologic and cellular 
processes, such as inflammation. HPGD is widely distributed in 
various mammalian tissues such as lung, breast, prostate, placenta 
and gut. Recent studies have shown a reduction of HPGD in some 
cancers, such as colorectal, breast, prostate, and lung cancer 
[39,40,41,42,43]. Many studies have revealed that HPGD may 
have tumor-suppressive properties [44,45]. Wolf et al. reported 
that HPGD was an epigenetically silenced tumor suppressor gene 
in breast cancer and that there was an association between HPGD 
expression and ER pathway activity. Prostaglandin E2 (PGE2) 
is a major stimulator of aromatase expression, thus leading to 
increased synthesis of estrogen within the breast [40]. PGE2 levels 
are regulated not only by its synthesis but also by its degradation. 

Table 3. Logistic regression analysis of associations between rs8752 and the risk of breast cancer (Stage I and II).

Polymorphisms              Cases (n = 2735)   Controls (n = 3114)   P value   OR (95% CI)         Adjusted OR (95% CI) a
                           No. (%)            No. (%)
rs8752 (G/A) (Stage I)
  GG                       567 (42.09)        715 (45.51)           0.038     1.00                1.00
  GA                       642 (47.66)        675 (42.97)                     1.20 (1.03, 1.40)   1.22 (1.05, 1.44)
  AA                       138 (10.24)        181 (11.52)                     0.96 (0.75, 1.23)   0.97 (0.76, 1.26)
  GA+AA                    780 (57.91)        856 (54.49)           0.050     1.15 (0.99, 1.33)   1.17 (1.01, 1.35)
rs8752 (G/A) (Stage II)
  GG                       578 (41.64)        716 (46.40)           0.018     1.00                1.00
  GA                       654 (47.12)        686 (44.46)                     1.18 (1.01, 1.38)   1.19 (1.02, 1.40)
  AA                       156 (11.24)        141 (9.14)                      1.37 (1.06, 1.77)   1.38 (1.08, 1.79)
  GA+AA                    810 (58.36)        827 (53.60)           0.012     1.21 (1.05, 1.40)   1.22 (1.06, 1.41)
rs8752 (G/A) (Combined)
  GG                       1145 (41.86)       1431 (45.95)          0.006     1.00                1.00
  GA                       1296 (47.39)       1361 (43.71)                    1.19 (1.07, 1.33)   1.20 (1.08, 1.34)
  AA                       294 (10.75)        322 (10.34)                     1.14 (0.96, 1.36)   1.15 (0.97, 1.38)
  GA+AA                    1590 (58.14)       1683 (54.05)          0.002     1.18 (1.06, 1.31)   1.19 (1.07, 1.32)
rs8752 (G/A) (ER-)
  GG                       443 (43.77)        1431 (45.95)          0.23      1.00                1.00
  GA+AA                    569 (56.23)        1683 (54.05)                    1.09 (0.95, 1.26)   1.10 (0.96, 1.28)
rs8752 (G/A) (ER+)
  GG                       631 (40.71)        1431 (45.95)          0.0007    1.00                1.00
  GA+AA                    919 (59.29)        1683 (54.05)                    1.24 (1.10, 1.40)   1.26 (1.08, 1.45)
rs8752 (G/A) (PR-)
  GG                       491 (41.54)        1431 (45.95)          0.009     1.00                1.00
  GA+AA                    691 (58.46)        1683 (54.05)                    1.20 (1.05, 1.37)   1.17 (1.03, 1.34)
rs8752 (G/A) (PR+)
  GG                       583 (42.22)        1431 (45.95)          0.02      1.00                1.00
  GA+AA                    798 (57.78)        1683 (54.05)                    1.16 (1.02, 1.32)   1.19 (1.05, 1.36)

a ORs were adjusted for age, smoking status, menopause status, oral contraception use, history of benign breast diseases, and family history of cancer.
doi:10.1371/journal.pone.0102093.t003



The key enzyme responsible for the biological inactivation of 
prostaglandins is NAD+-linked HPGD [41]. Our results add 
another dimension to the above findings: the A allele of HPGD rs8752 
was positively associated with breast cancer risk, and the 
association was ER status specific. SNP rs8752 (G/A) is located 
in the miR-485-5p binding site, and it is likely to disrupt the 
miR-485-5p/HPGD interaction. As shown in Figure S1 in File S1, the 
A allele cannot be targeted by miR-485-5p, which would result in 
increased HPGD protein expression, a possible underlying 
mechanism for the observed association with the risk of breast 
cancer. 
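
The ER-specific reading rests on comparing the ER positive and ER negative estimates. The text does not state how the P for heterogeneity was obtained. One common approach, assumed here purely for illustration, is a case-only comparison: because both subgroups share the same controls, the ratio of the two case-control ORs equals the genotype odds ratio between ER positive and ER negative cases. The sketch uses the case counts from Table 3; the function name is hypothetical.

```python
import math


def case_only_heterogeneity(carriers_pos, gg_pos, carriers_neg, gg_neg):
    """Ratio of subtype-specific ORs and its two-sided Wald P value.

    Compares carrier (GA+AA) versus GG odds between ER positive and
    ER negative cases; the shared control group cancels out.
    """
    ratio = (carriers_pos * gg_neg) / (gg_pos * carriers_neg)
    se_log_ratio = math.sqrt(
        1 / carriers_pos + 1 / gg_pos + 1 / carriers_neg + 1 / gg_neg
    )
    z = math.log(ratio) / se_log_ratio
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return ratio, p_value


# Case genotype counts from Table 3: ER positive (919 GA+AA, 631 GG)
# and ER negative (569 GA+AA, 443 GG).
ratio, p_value = case_only_heterogeneity(919, 631, 569, 443)
print(f"ratio of ORs = {ratio:.2f}, P for heterogeneity = {p_value:.2f}")
```

Under this assumed test the ratio of ORs is modestly above 1 and the P value is above 0.05, in line with the statement that heterogeneity by ER status was not significant.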

Although, to the best of our knowledge, this is the largest 
systematic case-control study investigating the association between 
microRNA target SNPs and breast cancer risk, our study has 
several limitations. First, we included only SNPs with a high 
frequency of variation, namely those with three genotypes and a minor 
genotype frequency ≥0.05. This strategy will inevitably miss some 
low-frequency SNPs that are associated with breast cancer risk. 
Second, functional studies are critical to confirm the 
associations found in this study, but such studies were not performed 
at this stage. Third, correction for multiple comparisons was not 
performed in this study, although our design, with a large sample size 
and a replication set, helps ensure the repeatability of our findings. 
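
To put the third limitation in numbers, the short sketch below computes, for the 192 SNPs tested in stage I and the 15 SNPs carried into stage II, the number of nominally significant results expected by chance at α = 0.05 and the corresponding Bonferroni threshold. It assumes, for illustration only, that every null hypothesis is true and that each SNP is tested once; neither the helper names nor the Bonferroni procedure come from the study.

```python
ALPHA = 0.05  # nominal significance level used in the study


def expected_false_positives(n_tests: int, alpha: float = ALPHA) -> float:
    """Expected count of P < alpha results if all null hypotheses hold."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


for label, n_tests in (("stage I", 192), ("stage II", 15)):
    print(
        f"{label}: {n_tests} tests, "
        f"expected chance findings = {expected_false_positives(n_tests):.2f}, "
        f"Bonferroni threshold = {bonferroni_threshold(n_tests):.5f}"
    )
```

The stage I figure shows why independent replication matters when no correction is applied: roughly ten of the 192 tests would be expected to reach P<0.05 by chance alone under these assumptions.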



In summary, our findings suggest that common variants in 
the HPGD gene might be associated with breast cancer risk among 
Chinese women. Further large studies are warranted to confirm 
these findings and to examine the biological mechanisms for the 
association. 

Supporting Information 

File S1 Supporting file including Figure S1, Table S1, 
and Table S2. Table S1. 192 microRNA binding site SNPs 
identified from the "Patrocles" database. Table S2. Characteristics of 
breast cancer cases and cancer-free controls (Stage I and Stage 
II). Figure S1. The duplex structure of miR-485-5p and the 
3' UTR of the HPGD gene. 
(DOCX) 